Refactor json_schema.py, implement JSON Schema to YAML
#1182
The JSON schema code is now in outlines-core. Unless there was an issue opened for that to be done separately, it looks like we missed it in #1175.
These changes will need to be moved and—potentially—ported to Rust. At the very least, we can't have two versions of the same basic JSON schema logic.
Seems there was a miscommunication. Thanks for clarifying @brandonwillard, I'll get started on porting the changes to Rust.
Overview

Refactor `json_schema.py` to be more coherent and extensible. Use extensibility to implement JSON Schema to YAML.

Changes

- Refactor `to_regex` into a class `JSONSchemaRegexGenerator` with visitors which implement JSON Schema rules, and formatters which implement pattern construction.
- Implement `YAMLRegexGenerator` by subclassing `JSONSchemaRegexGenerator` and overriding some formatters.

Tests:

- Update `test_json_schema.py` so its existing tests also apply to YAML.
- Test cases involving `anyOf` and `allOf`.
- In `test_generate.py::test_generate_json`, test both json and yaml modes.

Behavioral Changes

The only behavior changes are:

- Unsupported schema features raise `NotImplementedError`
- `anyOf`, `allOf`, `oneOf`
  - `anyOf`: Previously broken, now ORs sub-patterns
  - `allOf`: Previously broken, now ANDs sub-patterns via positive lookahead
  - `oneOf`: Warns user that it's using `anyOf` instead, and calls `anyOf`

The rules are much closer to the JSON Schema spec than they are on `main`; however, following the JSON Schema spec isn't always desirable. Users can enable the JSON Schema compliant validation rules via `strict_json_schema_subset=False`, resulting in:

- `items`: If unspecified, allow additional items without constraints
- `properties`: If unspecified, allow additional properties without constraints

json-schema.org test suite

This is a large change-set. To verify correctness, in addition to ensuring current tests pass, `test_json_schema_full.py` tests compliance with JSON Schema by retrieving 1,245 test cases from the official json-schema.org test suite.

[Results table not recoverable; surviving labels: `main`, `NotImplementedError` (acceptable: visible)]

Raising `NotImplementedError` makes it clear to the user why a schema would fail during generation, and it does so before generation.

`test_json_schema_to_yaml_compliance`

For each of the 263 tests which pass in `test_json_schema_to_json_compliance`, we test to verify their corresponding yaml pattern is also correct.

TODO

- Refactor `json_schema` so it's clean and extensible
- Extend `test_json_schema_full.py` to YAML
- Update docs to reflect new behaviour surrounding JSON Schema spec-compliant implementation
- `strict_json_schema_subset`

Further Work

- `json_schema.py` does too much. This new structure makes separation of concerns clear, easing a refactor.
- `JSONSchemaRegexGenerator.to_automata(...)`: not using a pattern intermediate would simplify things.
- Implement `NotImplemented` components based on users opening issues.
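
To make the visitor/formatter split and the `anyOf`/`allOf` behaviour from the description concrete, here is a minimal sketch. It is not the code in this PR: the class names `SketchRegexGenerator` and `SketchYAMLRegexGenerator`, the `visit`/`format_*` method names, and the simplified number patterns are all assumptions made for illustration, and the YAML override (accepting `~` as null) is only an example of what overriding a formatter could look like.

```python
import re


class SketchRegexGenerator:
    """Hypothetical, heavily reduced stand-in for JSONSchemaRegexGenerator."""

    def visit(self, schema):
        # Visitor: interprets schema keywords and delegates the actual
        # pattern spelling to the formatter methods below.
        if "anyOf" in schema:
            return self.format_any_of([self.visit(s) for s in schema["anyOf"]])
        if "allOf" in schema:
            return self.format_all_of([self.visit(s) for s in schema["allOf"]])
        formatters = {
            "integer": self.format_integer,
            "number": self.format_number,
            "null": self.format_null,
        }
        formatter = formatters.get(schema.get("type"))
        if formatter is None:
            # Fail while building the pattern, i.e. before any generation.
            raise NotImplementedError(f"unsupported schema: {schema!r}")
        return formatter()

    # Formatters: decide how each construct is spelled as a regex.
    # The numeric patterns are simplified (no exponents, leading zeros allowed).
    def format_integer(self):
        return r"-?\d+"

    def format_number(self):
        return r"-?\d+(?:\.\d+)?"

    def format_null(self):
        return r"null"

    def format_any_of(self, patterns):
        # OR: plain alternation of the sub-patterns.
        return "(?:" + "|".join(f"(?:{p})" for p in patterns) + ")"

    def format_all_of(self, patterns):
        # AND: every sub-pattern but the last becomes a positive lookahead
        # that must match through the end of the input; the last one consumes.
        # The \Z anchor assumes the allOf value is the whole string.
        *heads, last = patterns
        return "".join(rf"(?=(?:{p})\Z)" for p in heads) + f"(?:{last})"


class SketchYAMLRegexGenerator(SketchRegexGenerator):
    """Hypothetical YAML variant: same visitors, one formatter overridden."""

    def format_null(self):
        # Illustrative assumption: YAML also spells null as "~".
        return r"(?:null|~)"


json_gen = SketchRegexGenerator()
yaml_gen = SketchYAMLRegexGenerator()
nullable_int = {"anyOf": [{"type": "integer"}, {"type": "null"}]}
print(bool(re.fullmatch(json_gen.visit(nullable_int), "~")))  # False
print(bool(re.fullmatch(yaml_gen.visit(nullable_int), "~")))  # True
```

The sketch relies on Python's `re`, which supports lookahead. Regex engines without lookaround support (for example RE2 or Rust's `regex` crate) would need a different way to express the `allOf` intersection.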